CCBR-620

Maggie Cam

Oct 15, 2015

Background

To detect different gene expression in a small subpopulation (<1%) of skin, call Merkel cell, upon shh, fgf20, and edar mutation. Aim to analyze the connection between the three different pathways, and try to get clue of the genes participating in Merkel cell development.

E15.5 embryo skin from 2 mutant lines:

edar mutant = mutation in gene ectodysplasin-A receptor, Single point mutation (Mutation details: This allele involves a G to A transition mutation at nucleotide 1,135 that causes the amino acid change: glutamate to lysine at position 379 (E379K). (J:56496))
phenotype (http://www.informatics.jax.org/allele/allgenoviews/MGI:1856018) = abnormal coat/hair morphology, darkened coat color

fgf20 mutant = fgf20-β-galactosidase knock-in allele phenotype= no guard hair in adult mouse back skin

Data Processing

## Bioconductor version 3.2 (BiocInstaller 1.20.0), ?biocLite for help

This part of the program was run on biowulf, and data transferred to a local directory

#Command line version
module load subread
x=$(ls *.bam)
featureCounts -p -T 8 -s 2 -p -t exon -g gene_id -a /data/maggiec/RNASeq/Genomes/mm10/gencode.vM4.all.gtf -o counts_ss.txt $x

#Used R version:
gtf="data/maggiec/RNASeq/Genomes/mm10/gencode.vM4.all.gtf"
targets <- readTargets()
fc <- featureCounts(files=targets$bam,isGTFAnnotationFile=TRUE,nthreads=32,
      annot.ext=gtf,GTF.attrType="gene_name",strandSpecific=2,isPairedEnd=TRUE)
x <- DGEList(counts=fc$counts, genes=fc$annotation)

Load data from local directory:

setwd("/Users/maggiec/GitHub/Maggie/ccbr620/")
load("data/ssData.RData")
fc=fc_ss
#load("data/RNASeqData.RData")
ls()
## [1] "fc"      "fc_ss"   "targets"
fc$stat
##                        Status Sample_e10.bam Sample_e1.bam Sample_e2.bam
## 1                    Assigned       21121623      21175997      20813400
## 2        Unassigned_Ambiguity         237085        235010        234629
## 3     Unassigned_MultiMapping        3884051       3736163       3913784
## 4       Unassigned_NoFeatures        1634087       1608372       1925831
## 5         Unassigned_Unmapped              0             0             0
## 6   Unassigned_MappingQuality              0             0             0
## 7  Unassigned_FragementLength              0             0             0
## 8          Unassigned_Chimera              0             0             0
## 9        Unassigned_Secondary              0             0             0
## 10     Unassigned_Nonjunction              0             0             0
## 11       Unassigned_Duplicate              0             0             0
##    Sample_e4.bam Sample_e5.bam Sample_e9.bam Sample_f1.bam Sample_f2.bam
## 1       20614852      20837085      20879873      20541976      20644725
## 2         240383        237921        239807        232609        237351
## 3        4046417       4084228       4123848       3576274       3803101
## 4        2036974       1824097       1759145       2282551       2103056
## 5              0             0             0             0             0
## 6              0             0             0             0             0
## 7              0             0             0             0             0
## 8              0             0             0             0             0
## 9              0             0             0             0             0
## 10             0             0             0             0             0
## 11             0             0             0             0             0
##    Sample_f4.bam Sample_f5.bam Sample_f6.bam Sample_f7.bam
## 1       20773860      20443575      20472955      20641102
## 2         226857        231031        227117        233980
## 3        3820429       3654785       3566694       3639302
## 4        1941022       2323502       2342071       2156703
## 5              0             0             0             0
## 6              0             0             0             0
## 7              0             0             0             0
## 8              0             0             0             0
## 9              0             0             0             0
## 10             0             0             0             0
## 11             0             0             0             0
fc1=mat=fc$counts
tfc1=t(fc1)
filter <- apply(fc1, 1, function(x) length(x[x>5])>=1)
fc1filt <- fc1[filter,]
genes <- rownames(fc1filt)

QC Check: Look at raw signal distribution and median expression levels

Data is normalized by TMM and quantile normalization

## [1] 16321    12

Run Voom

#Do analysis for entire group
celltype <- factor(targets$Phenotype)
Batch <- factor(targets$Batch)
cell_rep=paste(celltype,Batch,sep=".")
design <- model.matrix(~0+celltype)
design
##    celltypeedar/+ control celltypeedar/edar mutant
## 1                       1                        0
## 2                       1                        0
## 3                       1                        0
## 4                       0                        1
## 5                       0                        1
## 6                       0                        1
## 7                       0                        0
## 8                       0                        0
## 9                       0                        0
## 10                      0                        0
## 11                      0                        0
## 12                      0                        0
##    celltypefgf20lz/+ control celltypefgf20lz/lz mutant
## 1                          0                         0
## 2                          0                         0
## 3                          0                         0
## 4                          0                         0
## 5                          0                         0
## 6                          0                         0
## 7                          1                         0
## 8                          1                         0
## 9                          0                         1
## 10                         1                         0
## 11                         0                         1
## 12                         0                         1
## attr(,"assign")
## [1] 1 1 1 1
## attr(,"contrasts")
## attr(,"contrasts")$celltype
## [1] "contr.treatment"
v <- voom(x,design,plot=TRUE,normalize="quantile")

After Normalization

## Using  as id variables

unnamed_chunk_6snapshot
You must enable Javascript to view this page properly.

Preprocessing of edar group (voom): 2 batches of pups (voom)

##              bam        Phenotype Batch
## 1 Sample_e10.bam   edar/+ control     1
## 2  Sample_e1.bam   edar/+ control     2
## 3  Sample_e2.bam   edar/+ control     2
## 4  Sample_e4.bam edar/edar mutant     2
## 5  Sample_e5.bam edar/edar mutant     2
## 6  Sample_e9.bam edar/edar mutant     1
##   (Intercept) Batch2 celltypeedar/edar mutant
## 1           1      0                        0
## 2           1      1                        0
## 3           1      1                        0
## 4           1      1                        1
## 5           1      1                        1
## 6           1      0                        1
## attr(,"assign")
## [1] 0 1 2
## attr(,"contrasts")
## attr(,"contrasts")$Batch
## [1] "contr.treatment"
## 
## attr(,"contrasts")$celltype
## [1] "contr.treatment"

unnamed_chunk_7snapshot
You must enable Javascript to view this page properly.

Preprocessing of Fgf group (voom): 1 batch of pups

##              bam         Phenotype Batch
## 7  Sample_f1.bam fgf20lz/+ control     3
## 8  Sample_f2.bam fgf20lz/+ control     3
## 9  Sample_f4.bam fgf20lz/lz mutant     3
## 10 Sample_f5.bam fgf20lz/+ control     3
## 11 Sample_f6.bam fgf20lz/lz mutant     3
## 12 Sample_f7.bam fgf20lz/lz mutant     3
##   (Intercept) celltypefgf20lz/lz mutant
## 1           1                         0
## 2           1                         0
## 3           1                         1
## 4           1                         0
## 5           1                         1
## 6           1                         1
## attr(,"assign")
## [1] 0 1
## attr(,"contrasts")
## attr(,"contrasts")$celltype
## [1] "contr.treatment"

unnamed_chunk_8snapshot
You must enable Javascript to view this page properly.

Genes of Interest

####Statistical Analysis of Edar group (lmfit)

#Run analysis of edar group:
fit1 <- lmFit(v1,design1) 
fit1 <- eBayes(fit1) 
options(digits=3) 
topgenes1=topTable(fit1,coef=3,n=50,sort.by="p")
FC = 2^(fit1$coefficients[,3])
FC = ifelse(FC<1,-1/FC,FC)
finalres=topTable(fit1,coef=3,sort="none",n=Inf)

Volcano Plot: Edar Group

Heatmap: Edar Group Before and after Batch Removal

Barchart: Edar Group Before and after Batch Removal

PCA Plot: After Batch Removal

unnamed_chunk_14snapshot
You must enable Javascript to view this page properly.

Statistical Analysis of Fgf group

design2 <- model.matrix(~celltype)
fit2 <- lmFit(v2,design2) 
fit2 <- eBayes(fit2) 
options(digits=3) 
topgenes2=topTable(fit2,coef=2,n=50,sort.by="p")
finalres2=topTable(fit2,coef=2,sort="none",n=Inf)

FC2 = 2^(fit2$coefficients[,2])
FC2 = ifelse(FC2<1,-1/FC2,FC2)

Volcano Plot: Fgf Group

Heatmap: Fgf Group

Boxplots: Fgf Group